Parsing with Context-Free Grammars and Word Statistics
Abstract
We present a language model in which the probability of a sentence is the sum of the probabilities of its individual parses. These parse probabilities are calculated using a probabilistic context-free grammar (PCFG) plus statistics on individual words and how they fit into parses. We have used the model to improve syntactic disambiguation. After training on Wall Street Journal (WSJ) text, we tested on about 200 WSJ sentences restricted to the 5400 most common words from our training data. We observed a 41% reduction in bracket-crossing errors compared to the performance of our PCFG without the word statistics.
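The central quantity in the abstract, a sentence probability obtained by summing over all parses, can be illustrated with a small sketch. This is not the authors' implementation: the toy grammar, its rule probabilities, and the function name `sentence_probability` are all hypothetical. The grammar is assumed to be in Chomsky normal form, and the word statistics that the paper adds on top of the PCFG are not modeled. The sum over parses is computed with the standard inside (CKY-style) dynamic program.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (not taken from the paper).
# Probabilities of all rules sharing a left-hand side sum to 1.
BINARY = {  # (parent, left child, right child) -> rule probability
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.2,
    ("PP", "P", "NP"): 1.0,
}
LEXICAL = {  # (parent, word) -> rule probability
    ("NP", "she"): 0.3,
    ("NP", "stars"): 0.3,
    ("NP", "telescopes"): 0.2,
    ("V", "saw"): 1.0,
    ("P", "with"): 1.0,
}


def sentence_probability(words, start="S"):
    """Sum the probabilities of every parse of `words` rooted in `start`."""
    n = len(words)
    # inside[(i, j, X)] = total probability that X derives words[i:j]
    inside = defaultdict(float)

    # Width-1 spans come from lexical rules.
    for i, w in enumerate(words):
        for (parent, word), p in LEXICAL.items():
            if word == w:
                inside[(i, i + 1, parent)] += p

    # Wider spans combine two adjacent sub-spans through a binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, left, right), p in BINARY.items():
                    inside[(i, j, parent)] += (
                        p * inside[(i, k, left)] * inside[(k, j, right)]
                    )

    return inside[(0, n, start)]


if __name__ == "__main__":
    print(sentence_probability("she saw stars with telescopes".split()))
```

Under this toy grammar, "she saw stars with telescopes" has two parses: one attaches the prepositional phrase to the verb phrase (probability 0.00378) and the other attaches it to the noun phrase (probability 0.00252). The function returns their sum, approximately 0.0063. A disambiguator would instead pick the single most probable parse, which is the step the paper's word statistics are meant to improve.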
Similar resources
Parsing Non-Recursive Context-Free Grammars
We consider the problem of parsing non-recursive context-free grammars, i.e., context-free grammars that generate finite languages. In natural language processing, this problem arises in several areas of application, including natural language generation, speech recognition and machine translation. We present two tabular algorithms for parsing of non-recursive context-free grammars, and show th...
Generalized LR Parsing for Grammars with Contexts
The Generalized LR parsing algorithm for context-free grammars is notable for having a decent worst-case running time (cubic in the length of the input string), as well as much better performance on “good” grammars. This paper extends the Generalized LR algorithm to the case of “grammars with left contexts” (M. Barash, A. Okhotin, “An extension of context-free grammars with one-sided context sp...
A New Method for Dependent Parsing
Dependent grammars extend context-free grammars by allowing semantic values to be bound to variables and used to constrain parsing. Dependent grammars can cleanly specify common features that cannot be handled by context-free grammars, such as length fields in data formats and significant indentation in programming languages. Few parser generators support dependent parsing, however. To address ...
Partially Ordered Multiset Context-free Grammars and Free-word-order Parsing
We present a new formalism, partially ordered multiset context-free grammars (pomsCFG), along with an Earley-style parsing algorithm. The formalism, which can be thought of as a generalization of context-free grammars with partially ordered right-hand sides, is of interest in its own right, and also as infrastructure for obtaining tighter complexity bounds for more expressive context-free forma...
Simple, Efficient, Sound and Complete Combinator Parsing for All Context-Free Grammars, Using an Oracle
Parsers for context-free grammars can be implemented directly and naturally in a functional style known as “combinator parsing”, using recursion following the structure of the grammar rules. Traditionally parser combinators have struggled to handle all features of context-free grammars, such as left recursion. Previous work introduced novel parser combinators that could be used to parse all con...
Publication date: 1995